{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "7m81oxz-sGgM"
      },
      "source": [
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LianjiaTech/BELLE/blob/main/notebook/BELLE_INFER_COLAB.ipynb) "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "p70s1UElROWa"
      },
      "source": [
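        "# **Example: BELLE model inference on Colab**\n",
        "This notebook provides code for running the BELLE model in a Colab environment. By default it loads a 4-bit quantized BLOOM 7B model; 4-bit quantization currently still costs some output quality. While the model is being loaded, host RAM usage peaks at about 8 GB. Once the model has been moved to the GPU, about 4 GB of RAM and about 10 GB of GPU memory are needed.\n"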
      ]
    },
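    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough, weights-only sanity check on the memory figures above, the following sketch estimates how much storage about 7 billion parameters need at different bit widths. The parameter count (`N_PARAMS`) and the helper name `weight_memory_gib` are illustrative assumptions, not taken from the BELLE code. The estimate ignores layers left unquantized, quantization metadata, activations, and the CUDA context, so real GPU usage is higher than these numbers."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Hypothetical helper: storage for the weights alone, in GiB.\n",
        "def weight_memory_gib(n_params, bits_per_weight):\n",
        "    total_bytes = n_params * bits_per_weight / 8\n",
        "    return total_bytes / 2**30\n",
        "\n",
        "N_PARAMS = 7e9  # assumption: roughly 7 billion parameters\n",
        "\n",
        "# Compare half precision with 8-bit and 4-bit quantization.\n",
        "for bits in (16, 8, 4):\n",
        "    print(f'{bits:>2}-bit weights: {weight_memory_gib(N_PARAMS, bits):.1f} GiB')"
      ]
    },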
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "QUt9JenaRViP"
      },
      "source": [
        "## Check which GPU Colab has assigned; free accounts usually get a T4 with about 14 GB of memory"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "ORaFqtT6QV4c"
      },
      "source": [
        "\n",
        "\n",
        "---\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "BLwfc3zuPqmK",
        "outputId": "976890b7-a042-40a7-c787-df8578585f7d"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Thu Apr 13 09:41:39 2023       \n",
            "+-----------------------------------------------------------------------------+\n",
            "| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |\n",
            "|-------------------------------+----------------------+----------------------+\n",
            "| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n",
            "| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n",
            "|                               |                      |               MIG M. |\n",
            "|===============================+======================+======================|\n",
            "|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |\n",
            "| N/A   68C    P8    11W /  70W |      0MiB / 15360MiB |      0%      Default |\n",
            "|                               |                      |                  N/A |\n",
            "+-------------------------------+----------------------+----------------------+\n",
            "                                                                               \n",
            "+-----------------------------------------------------------------------------+\n",
            "| Processes:                                                                  |\n",
            "|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n",
            "|        ID   ID                                                   Usage      |\n",
            "|=============================================================================|\n",
            "|  No running processes found                                                 |\n",
            "+-----------------------------------------------------------------------------+\n"
          ]
        }
      ],
      "source": [
        "!nvidia-smi"
      ]
    },
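    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The same check can be done from Python, which is convenient when later cells should react to the assigned GPU. This is a minimal sketch: it calls `nvidia-smi` with its `--query-gpu` and `--format` options and parses lines such as `Tesla T4, 15360 MiB`. The helper name `parse_gpu_line` is hypothetical, and the fallback message is only a guess at the most common failure cause."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import subprocess\n",
        "\n",
        "# Hypothetical helper: split 'Tesla T4, 15360 MiB' into ('Tesla T4', 15360).\n",
        "def parse_gpu_line(line):\n",
        "    name, mem = line.rsplit(',', 1)\n",
        "    return name.strip(), int(mem.split()[0])\n",
        "\n",
        "try:\n",
        "    result = subprocess.run(\n",
        "        ['nvidia-smi', '--query-gpu=name,memory.total', '--format=csv,noheader'],\n",
        "        capture_output=True, text=True, check=True,\n",
        "    )\n",
        "    # One output line per visible GPU.\n",
        "    for line in result.stdout.strip().splitlines():\n",
        "        name, total_mib = parse_gpu_line(line)\n",
        "        print(f'{name}: {total_mib} MiB total')\n",
        "except (FileNotFoundError, subprocess.CalledProcessError):\n",
        "    print('nvidia-smi is unavailable; is a GPU runtime attached?')"
      ]
    },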
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "w8ZvOYEKRl-N"
      },
      "source": [
        "## Clone the BELLE repository into Colab"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "_zSYeftDDS3l",
        "outputId": "e55cc6c2-30db-4a07-ea58-c473f09b0b11"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Cloning into 'BELLE'...\n",
            "remote: Enumerating objects: 963, done.\u001b[K\n",
            "remote: Counting objects: 100% (512/512), done.\u001b[K\n",
            "remote: Compressing objects: 100% (298/298), done.\u001b[K\n",
            "remote: Total 963 (delta 365), reused 293 (delta 214), pack-reused 451\u001b[K\n",
            "Receiving objects: 100% (963/963), 5.41 MiB | 14.90 MiB/s, done.\n",
            "Resolving deltas: 100% (528/528), done.\n"
          ]
        }
      ],
      "source": [
        "!git clone https://github.com/LianjiaTech/BELLE.git \n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "5u8KiaitR3Yt"
      },
      "source": [
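        "### A 14 GB GPU currently supports only the quantized version, so for now this notebook covers Colab inference with the quantized model only"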
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "xEzcL3t7DkAW",
        "outputId": "f8a26b35-5a80-4f89-9d18-5f3e5da354c0"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "/content/BELLE/gptq\n"
          ]
        }
      ],
      "source": [
        "%cd BELLE/models/gptq"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "QpGt4F3BSLW-"
      },
      "source": [
        "### Install the GPTQ environment"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Wd9frauTDx8t",
        "outputId": "391598d9-49aa-4f78-e58a-1d9fefafe808"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
            "Collecting git+https://github.com/huggingface/transformers (from -r requirements.txt (line 4))\n",
            "  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-pqbdyssy\n",
            "  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-pqbdyssy\n",
            "  Resolved https://github.com/huggingface/transformers to commit 7ade6ef7d48906d7cd7a3dcbab5645b4a6c7c82c\n",
            "  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting safetensors==0.3.0\n",
            "  Downloading safetensors-0.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.2/1.2 MB\u001b[0m \u001b[31m24.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting datasets==2.10.1\n",
            "  Downloading datasets-2.10.1-py3-none-any.whl (469 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m469.0/469.0 kB\u001b[0m \u001b[31m31.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting sentencepiece\n",
            "  Downloading sentencepiece-0.1.98-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m68.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting dill<0.3.7,>=0.3.0\n",
            "  Downloading dill-0.3.6-py3-none-any.whl (110 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m110.5/110.5 kB\u001b[0m \u001b[31m15.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (4.65.0)\n",
            "Collecting huggingface-hub<1.0.0,>=0.2.0\n",
            "  Downloading huggingface_hub-0.13.4-py3-none-any.whl (200 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m200.1/200.1 kB\u001b[0m \u001b[31m26.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting xxhash\n",
            "  Downloading xxhash-3.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m212.2/212.2 kB\u001b[0m \u001b[31m26.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: packaging in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (23.0)\n",
            "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (2.27.1)\n",
            "Collecting aiohttp\n",
            "  Downloading aiohttp-3.8.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m69.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (2023.3.0)\n",
            "Collecting multiprocess\n",
            "  Downloading multiprocess-0.70.14-py39-none-any.whl (132 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m132.9/132.9 kB\u001b[0m \u001b[31m17.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting responses<0.19\n",
            "  Downloading responses-0.18.0-py3-none-any.whl (38 kB)\n",
            "Requirement already satisfied: pyarrow>=6.0.0 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (9.0.0)\n",
            "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (6.0)\n",
            "Requirement already satisfied: pandas in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (1.5.3)\n",
            "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.9/dist-packages (from datasets==2.10.1->-r requirements.txt (line 2)) (1.22.4)\n",
            "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.9/dist-packages (from transformers==4.29.0.dev0->-r requirements.txt (line 4)) (2022.10.31)\n",
            "Requirement already satisfied: filelock in /usr/local/lib/python3.9/dist-packages (from transformers==4.29.0.dev0->-r requirements.txt (line 4)) (3.11.0)\n",
            "Collecting tokenizers!=0.11.3,<0.14,>=0.11.1\n",
            "  Downloading tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m106.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.9/dist-packages (from aiohttp->datasets==2.10.1->-r requirements.txt (line 2)) (2.0.12)\n",
            "Collecting multidict<7.0,>=4.5\n",
            "  Downloading multidict-6.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m114.2/114.2 kB\u001b[0m \u001b[31m14.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting aiosignal>=1.1.2\n",
            "  Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)\n",
            "Collecting frozenlist>=1.1.1\n",
            "  Downloading frozenlist-1.3.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (158 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m158.8/158.8 kB\u001b[0m \u001b[31m21.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting async-timeout<5.0,>=4.0.0a3\n",
            "  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)\n",
            "Collecting yarl<2.0,>=1.0\n",
            "  Downloading yarl-1.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (264 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m264.6/264.6 kB\u001b[0m \u001b[31m25.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.9/dist-packages (from aiohttp->datasets==2.10.1->-r requirements.txt (line 2)) (22.2.0)\n",
            "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.9/dist-packages (from huggingface-hub<1.0.0,>=0.2.0->datasets==2.10.1->-r requirements.txt (line 2)) (4.5.0)\n",
            "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->datasets==2.10.1->-r requirements.txt (line 2)) (3.4)\n",
            "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->datasets==2.10.1->-r requirements.txt (line 2)) (2022.12.7)\n",
            "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->datasets==2.10.1->-r requirements.txt (line 2)) (1.26.15)\n",
            "Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.9/dist-packages (from pandas->datasets==2.10.1->-r requirements.txt (line 2)) (2.8.2)\n",
            "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/dist-packages (from pandas->datasets==2.10.1->-r requirements.txt (line 2)) (2022.7.1)\n",
            "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/dist-packages (from python-dateutil>=2.8.1->pandas->datasets==2.10.1->-r requirements.txt (line 2)) (1.16.0)\n",
            "Building wheels for collected packages: transformers\n",
            "  Building wheel for transformers (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for transformers: filename=transformers-4.29.0.dev0-py3-none-any.whl size=6928021 sha256=c39131f292b1f69fde9405dc40b7f4ed534a8e1ca2b69c04a1f4b3465ffeaa97\n",
            "  Stored in directory: /tmp/pip-ephem-wheel-cache-s259wrvv/wheels/14/a0/7b/8f6b25ba4110aa215fcb8d6aedd6cd4f9b9b6619190999ac2b\n",
            "Successfully built transformers\n",
            "Installing collected packages: tokenizers, sentencepiece, safetensors, xxhash, multidict, frozenlist, dill, async-timeout, yarl, responses, multiprocess, huggingface-hub, aiosignal, transformers, aiohttp, datasets\n",
            "Successfully installed aiohttp-3.8.4 aiosignal-1.3.1 async-timeout-4.0.2 datasets-2.10.1 dill-0.3.6 frozenlist-1.3.3 huggingface-hub-0.13.4 multidict-6.0.4 multiprocess-0.70.14 responses-0.18.0 safetensors-0.3.0 sentencepiece-0.1.98 tokenizers-0.13.3 transformers-4.29.0.dev0 xxhash-3.2.0 yarl-1.8.2\n"
          ]
        }
      ],
      "source": [
        "!pip install -r requirements.txt"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "wT0Gq7tkEWs6",
        "outputId": "8dde5328-0a68-4af7-ef28-4e21433b21e9"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "running install\n",
            "/usr/local/lib/python3.9/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.\n",
            "  warnings.warn(\n",
            "/usr/local/lib/python3.9/dist-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.\n",
            "  warnings.warn(\n",
            "running bdist_egg\n",
            "running egg_info\n",
            "creating quant_cuda.egg-info\n",
            "writing quant_cuda.egg-info/PKG-INFO\n",
            "writing dependency_links to quant_cuda.egg-info/dependency_links.txt\n",
            "writing top-level names to quant_cuda.egg-info/top_level.txt\n",
            "writing manifest file 'quant_cuda.egg-info/SOURCES.txt'\n",
            "/usr/local/lib/python3.9/dist-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.\n",
            "  warnings.warn(msg.format('we could not find ninja.'))\n",
            "reading manifest file 'quant_cuda.egg-info/SOURCES.txt'\n",
            "writing manifest file 'quant_cuda.egg-info/SOURCES.txt'\n",
            "installing library code to build/bdist.linux-x86_64/egg\n",
            "running install_lib\n",
            "running build_ext\n",
            "/usr/local/lib/python3.9/dist-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 11.8\n",
            "  warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')\n",
            "building 'quant_cuda' extension\n",
            "creating build\n",
            "creating build/temp.linux-x86_64-3.9\n",
            "x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.9/dist-packages/torch/include -I/usr/local/lib/python3.9/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.9/dist-packages/torch/include/TH -I/usr/local/lib/python3.9/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.9 -c quant_cuda.cpp -o build/temp.linux-x86_64-3.9/quant_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17\n",
            "/usr/local/cuda/bin/nvcc -I/usr/local/lib/python3.9/dist-packages/torch/include -I/usr/local/lib/python3.9/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.9/dist-packages/torch/include/TH -I/usr/local/lib/python3.9/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.9 -c quant_cuda_kernel.cu -o build/temp.linux-x86_64-3.9/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17\n",
            "\u001b[01m\u001b[0m\u001b[01m/usr/local/lib/python3.9/dist-packages/torch/include/c10/util/irange.h(54)\u001b[0m: \u001b[01;35mwarning\u001b[0m #186-D: pointless comparison of unsigned integer with zero\n",
            "          detected during:\n",
            "            instantiation of \u001b[01m\"__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=size_t, one_sided=false, <unnamed>=0]\"\u001b[0m \u001b[32m\n",
            "(61): here\u001b[0m\n",
            "            instantiation of \u001b[01m\"__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=size_t, one_sided=false, <unnamed>=0]\"\u001b[0m \u001b[32m\n",
            "/usr/local/lib/python3.9/dist-packages/torch/include/c10/core/TensorImpl.h(77): here\u001b[0m\n",
            "\n",
            "\u001b[01m\u001b[0m\u001b[01m/usr/local/lib/python3.9/dist-packages/torch/include/c10/util/irange.h(54)\u001b[0m: \u001b[01;35mwarning\u001b[0m #186-D: pointless comparison of unsigned integer with zero\n",
            "          detected during:\n",
            "            instantiation of \u001b[01m\"__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]\"\u001b[0m \u001b[32m\n",
            "(61): here\u001b[0m\n",
            "            instantiation of \u001b[01m\"__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]\"\u001b[0m \u001b[32m\n",
            "/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/qualified_name.h(73): here\u001b[0m\n",
            "\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:41:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kat::DeprecatedTypeProperties& at::Tensor::type() const\u001b[m\u001b[K’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:222:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  222 | \u001b[01;36m\u001b[K  De\u001b[m\u001b[KprecatedTypeProperties & type() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1011:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1032:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1056:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1083:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1106:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:1999:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:2020:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:2043:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:2069:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:123:2092:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  123 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:41:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kat::DeprecatedTypeProperties& at::Tensor::type() const\u001b[m\u001b[K’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:222:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  222 | \u001b[01;36m\u001b[K  De\u001b[m\u001b[KprecatedTypeProperties & type() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1011:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1032:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1056:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1083:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1106:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:1999:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:2020:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:2043:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                               \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:2069:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:218:2092:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  218 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:41:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kat::DeprecatedTypeProperties& at::Tensor::type() const\u001b[m\u001b[K’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:222:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  222 | \u001b[01;36m\u001b[K  De\u001b[m\u001b[KprecatedTypeProperties & type() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1011:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1032:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1056:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1083:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1106:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:1999:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:2020:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:2043:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                               \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:2069:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:377:2092:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  377 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:41:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kat::DeprecatedTypeProperties& at::Tensor::type() const\u001b[m\u001b[K’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:222:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  222 | \u001b[01;36m\u001b[K  De\u001b[m\u001b[KprecatedTypeProperties & type() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:163:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[Kc10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)\u001b[m\u001b[K’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/Dispatch.h:122:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  122 | \u001b[01;36m\u001b[Kinline at::\u001b[m\u001b[KScalarType scalar_type(const at::DeprecatedTypeProperties& t) {\n",
            "      | \u001b[01;36m\u001b[K^~~~~~~~~~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1011:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1032:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1056:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1083:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = double]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1106:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:\u001b[m\u001b[K In lambda function:\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:1999:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                   \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:2020:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                        \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:2043:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                               \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:2069:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = float]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                         \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[Kquant_cuda_kernel.cu:464:2092:\u001b[m\u001b[K \u001b[01;35m\u001b[Kwarning: \u001b[m\u001b[K‘\u001b[01m\u001b[KT* at::Tensor::data() const [with T = int]\u001b[m\u001b[K’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [\u001b[01;35m\u001b[K-Wdeprecated-declarations\u001b[m\u001b[K]\n",
            "  464 |   AT_DISPATCH_FLOATING_TYPES(\n",
            "      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                \u001b[01;35m\u001b[K^\u001b[m\u001b[K\n",
            "\u001b[01m\u001b[K/usr/local/lib/python3.9/dist-packages/torch/include/ATen/core/TensorBody.h:244:1:\u001b[m\u001b[K \u001b[01;36m\u001b[Knote: \u001b[m\u001b[Kdeclared here\n",
            "  244 | \u001b[01;36m\u001b[K  T \u001b[m\u001b[K* data() const {\n",
            "      | \u001b[01;36m\u001b[K^\u001b[m\u001b[K \u001b[01;36m\u001b[K~~\u001b[m\u001b[K\n",
            "creating build/lib.linux-x86_64-3.9\n",
            "x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,-Bsymbolic-functions -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.9/quant_cuda.o build/temp.linux-x86_64-3.9/quant_cuda_kernel.o -L/usr/local/lib/python3.9/dist-packages/torch/lib -L/usr/local/cuda/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.9/quant_cuda.cpython-39-x86_64-linux-gnu.so\n",
            "creating build/bdist.linux-x86_64\n",
            "creating build/bdist.linux-x86_64/egg\n",
            "copying build/lib.linux-x86_64-3.9/quant_cuda.cpython-39-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg\n",
            "creating stub loader for quant_cuda.cpython-39-x86_64-linux-gnu.so\n",
            "byte-compiling build/bdist.linux-x86_64/egg/quant_cuda.py to quant_cuda.cpython-39.pyc\n",
            "creating build/bdist.linux-x86_64/egg/EGG-INFO\n",
            "copying quant_cuda.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO\n",
            "copying quant_cuda.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n",
            "copying quant_cuda.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n",
            "copying quant_cuda.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n",
            "writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt\n",
            "zip_safe flag not set; analyzing archive contents...\n",
            "__pycache__.quant_cuda.cpython-39: module references __file__\n",
            "creating dist\n",
            "creating 'dist/quant_cuda-0.0.0-py3.9-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it\n",
            "removing 'build/bdist.linux-x86_64/egg' (and everything under it)\n",
            "Processing quant_cuda-0.0.0-py3.9-linux-x86_64.egg\n",
            "creating /usr/local/lib/python3.9/dist-packages/quant_cuda-0.0.0-py3.9-linux-x86_64.egg\n",
            "Extracting quant_cuda-0.0.0-py3.9-linux-x86_64.egg to /usr/local/lib/python3.9/dist-packages\n",
            "Adding quant-cuda 0.0.0 to easy-install.pth file\n",
            "\n",
            "Installed /usr/local/lib/python3.9/dist-packages/quant_cuda-0.0.0-py3.9-linux-x86_64.egg\n",
            "Processing dependencies for quant-cuda==0.0.0\n",
            "Finished processing dependencies for quant-cuda==0.0.0\n",
            "Benchmarking LLaMa-7B FC2 matvec ...\n",
            "FP16: 0.0006610641479492188\n",
            "2bit: 0.0011321241855621338\n",
            "3bit: 0.000905475378036499\n",
            "4bit: 0.0008348245620727539\n",
            "8bit: 0.0006743667125701904\n",
            "Verifiying kernel correctness ...\n",
            "2bit Simu: tensor([[[-0.5031,  0.2357, -0.6711,  ..., -0.6525, -0.8541, -1.1861],\n",
            "         [ 0.0038, -0.1658, -0.3461,  ..., -0.4995, -0.0135,  0.5206],\n",
            "         [ 0.2579, -0.0474,  0.5508,  ...,  0.0792, -0.4441,  0.0668],\n",
            "         ...,\n",
            "         [-0.5598, -0.1225, -0.1789,  ..., -0.4645,  0.1437, -0.4854],\n",
            "         [-0.2219, -0.1854,  1.2809,  ..., -0.0998,  0.6955,  1.1353],\n",
            "         [ 0.2213,  0.9016, -0.2213,  ..., -0.0833, -0.6237,  0.7751]],\n",
            "\n",
            "        [[ 0.3603,  0.2940, -0.0782,  ...,  0.4057, -0.2398, -0.0825],\n",
            "         [ 0.4358, -0.2107, -0.4587,  ..., -0.4773,  0.6190,  0.1566],\n",
            "         [ 0.3579,  0.5932, -0.2014,  ..., -0.0768,  0.7605,  1.2279],\n",
            "         ...,\n",
            "         [-1.1553, -0.2112, -0.5064,  ...,  0.0082, -0.0682,  0.2376],\n",
            "         [ 0.7963,  0.1977, -0.1137,  ..., -0.5868, -0.6299, -0.1280],\n",
            "         [-0.0064, -0.0386, -0.2689,  ..., -0.2614,  1.0313,  0.0985]],\n",
            "\n",
            "        [[-0.3863,  0.1260,  0.3173,  ..., -0.0468,  0.3026,  0.3202],\n",
            "         [ 1.1275,  0.2991, -0.3274,  ...,  0.5410, -0.0722,  0.4927],\n",
            "         [ 0.3988,  0.9053, -0.4587,  ...,  0.1677, -0.0551,  0.4146],\n",
            "         ...,\n",
            "         [-1.0412,  0.7489,  0.4134,  ..., -0.5603, -0.3231,  0.3777],\n",
            "         [-0.1745,  0.4199,  0.6054,  ..., -0.0105,  0.5054,  0.2877],\n",
            "         [-0.0217, -0.1290, -0.9906,  ..., -0.1895, -0.6119,  0.4717]],\n",
            "\n",
            "        [[ 0.3545,  0.0510, -0.9710,  ...,  0.4090,  0.6559, -0.1028],\n",
            "         [-0.9271,  0.2109,  0.2578,  ..., -0.5058,  0.0775, -0.3850],\n",
            "         [-0.1065, -0.6438, -0.2291,  ...,  0.1805,  0.7880, -0.6970],\n",
            "         ...,\n",
            "         [-0.0583,  0.6332,  0.2745,  ..., -0.7127, -0.5598,  0.5953],\n",
            "         [-0.4749, -0.2513, -1.3746,  ..., -0.2019, -0.1938,  0.1304],\n",
            "         [-0.5104, -0.6171,  0.2718,  ...,  0.3684, -0.6506, -0.1029]],\n",
            "\n",
            "        [[-0.6974, -0.9570,  0.4948,  ...,  0.4633,  0.3876,  0.3986],\n",
            "         [ 0.5284,  0.2018,  0.1197,  ...,  0.3437,  0.1782, -0.0771],\n",
            "         [ 0.8205, -0.0388, -0.3899,  ..., -0.7655,  1.6889,  0.0526],\n",
            "         ...,\n",
            "         [ 0.0445,  1.0198,  0.1388,  ..., -0.8089,  0.7321,  0.0123],\n",
            "         [ 1.8570, -0.1649, -0.4458,  ...,  0.3121, -0.2786,  1.3564],\n",
            "         [-0.5506, -0.2635,  0.7586,  ...,  0.0220, -0.3207, -0.2966]]],\n",
            "       device='cuda:0')\n",
            "2bit Kern: tensor([[[-0.5031,  0.2357, -0.6711,  ..., -0.6525, -0.8541, -1.1861],\n",
            "         [ 0.0038, -0.1658, -0.3461,  ..., -0.4995, -0.0135,  0.5206],\n",
            "         [ 0.2579, -0.0474,  0.5508,  ...,  0.0792, -0.4441,  0.0668],\n",
            "         ...,\n",
            "         [-0.5598, -0.1225, -0.1789,  ..., -0.4645,  0.1437, -0.4854],\n",
            "         [-0.2219, -0.1854,  1.2809,  ..., -0.0998,  0.6955,  1.1353],\n",
            "         [ 0.2213,  0.9016, -0.2213,  ..., -0.0833, -0.6237,  0.7751]],\n",
            "\n",
            "        [[ 0.3603,  0.2940, -0.0782,  ...,  0.4057, -0.2398, -0.0825],\n",
            "         [ 0.4358, -0.2107, -0.4587,  ..., -0.4773,  0.6190,  0.1566],\n",
            "         [ 0.3579,  0.5932, -0.2014,  ..., -0.0768,  0.7605,  1.2279],\n",
            "         ...,\n",
            "         [-1.1553, -0.2112, -0.5064,  ...,  0.0082, -0.0682,  0.2376],\n",
            "         [ 0.7963,  0.1977, -0.1137,  ..., -0.5868, -0.6299, -0.1280],\n",
            "         [-0.0064, -0.0386, -0.2689,  ..., -0.2614,  1.0313,  0.0985]],\n",
            "\n",
            "        [[-0.3863,  0.1260,  0.3173,  ..., -0.0468,  0.3026,  0.3202],\n",
            "         [ 1.1275,  0.2991, -0.3274,  ...,  0.5410, -0.0722,  0.4927],\n",
            "         [ 0.3988,  0.9053, -0.4587,  ...,  0.1677, -0.0551,  0.4146],\n",
            "         ...,\n",
            "         [-1.0412,  0.7489,  0.4134,  ..., -0.5603, -0.3231,  0.3777],\n",
            "         [-0.1745,  0.4199,  0.6054,  ..., -0.0105,  0.5054,  0.2877],\n",
            "         [-0.0217, -0.1290, -0.9906,  ..., -0.1895, -0.6119,  0.4717]],\n",
            "\n",
            "        [[ 0.3545,  0.0510, -0.9710,  ...,  0.4090,  0.6559, -0.1028],\n",
            "         [-0.9271,  0.2109,  0.2578,  ..., -0.5058,  0.0775, -0.3850],\n",
            "         [-0.1065, -0.6438, -0.2291,  ...,  0.1805,  0.7880, -0.6970],\n",
            "         ...,\n",
            "         [-0.0583,  0.6332,  0.2745,  ..., -0.7127, -0.5598,  0.5953],\n",
            "         [-0.4749, -0.2513, -1.3746,  ..., -0.2019, -0.1938,  0.1304],\n",
            "         [-0.5104, -0.6171,  0.2718,  ...,  0.3684, -0.6506, -0.1029]],\n",
            "\n",
            "        [[-0.6974, -0.9570,  0.4948,  ...,  0.4633,  0.3876,  0.3986],\n",
            "         [ 0.5284,  0.2018,  0.1197,  ...,  0.3437,  0.1782, -0.0771],\n",
            "         [ 0.8205, -0.0388, -0.3899,  ..., -0.7655,  1.6889,  0.0526],\n",
            "         ...,\n",
            "         [ 0.0445,  1.0198,  0.1388,  ..., -0.8089,  0.7321,  0.0123],\n",
            "         [ 1.8570, -0.1649, -0.4458,  ...,  0.3121, -0.2786,  1.3564],\n",
            "         [-0.5506, -0.2635,  0.7586,  ...,  0.0220, -0.3207, -0.2966]]],\n",
            "       device='cuda:0')\n",
            "\n",
            "\n",
            "3bit Simu: tensor([[[-2.6905e-01, -1.5902e-01,  9.0068e-01,  ..., -6.6669e-01,\n",
            "          -1.0632e-01,  8.3940e-02],\n",
            "         [-1.9759e-01, -1.0823e-01, -1.1897e+00,  ...,  8.0771e-01,\n",
            "          -2.2273e-01,  5.8908e-01],\n",
            "         [-8.3896e-01,  1.3294e+00, -5.5757e-01,  ..., -1.3583e+00,\n",
            "           3.4895e-01,  6.4654e-01],\n",
            "         ...,\n",
            "         [-1.2204e-01, -4.0897e-01, -1.8123e-01,  ..., -5.1525e-01,\n",
            "          -1.1903e+00,  9.8983e-01],\n",
            "         [-9.6853e-01, -2.7724e-02, -5.2704e-02,  ...,  4.3679e-01,\n",
            "           1.4726e-01, -3.9800e-01],\n",
            "         [ 1.6033e-01, -1.2784e+00,  6.7417e-01,  ...,  1.0639e-01,\n",
            "           3.4901e-01,  5.3523e-01]],\n",
            "\n",
            "        [[ 3.6894e-01, -4.5176e-01,  1.2197e-01,  ...,  3.4895e-02,\n",
            "          -4.3593e-01, -5.4872e-01],\n",
            "         [ 7.3333e-02,  8.8676e-01,  4.7564e-01,  ..., -5.2754e-01,\n",
            "          -2.4349e-02, -1.5780e-01],\n",
            "         [ 4.2421e-01, -1.9107e-01, -5.3282e-01,  ...,  6.1283e-01,\n",
            "           2.5989e-01, -1.2081e+00],\n",
            "         ...,\n",
            "         [-1.0019e-01, -5.7981e-01,  2.8965e-02,  ..., -4.4218e-01,\n",
            "           9.1704e-01,  1.6721e-01],\n",
            "         [-4.0398e-01,  1.1743e+00,  3.6515e-01,  ...,  7.9155e-01,\n",
            "           1.9716e-01,  8.2481e-01],\n",
            "         [-2.5617e-02,  2.3317e-01,  1.0999e+00,  ...,  1.4933e-01,\n",
            "           5.7142e-02, -1.5906e-01]],\n",
            "\n",
            "        [[-5.9735e-02,  4.1035e-01, -1.3762e-01,  ..., -3.0711e-01,\n",
            "          -9.4239e-01,  7.3383e-02],\n",
            "         [ 1.8717e-01,  8.9837e-01, -1.1502e+00,  ...,  2.1802e-01,\n",
            "           2.9636e-02, -3.2209e-01],\n",
            "         [-6.9481e-01,  9.0093e-01, -6.3316e-01,  ...,  1.1667e-01,\n",
            "           5.6763e-03,  1.5465e+00],\n",
            "         ...,\n",
            "         [-1.6929e-01, -2.0358e-01, -2.7423e-01,  ...,  1.8102e-01,\n",
            "           7.7124e-01, -3.1629e-01],\n",
            "         [ 1.3252e-01,  7.3445e-01,  1.1723e+00,  ...,  3.0763e-01,\n",
            "           1.3674e-01, -3.4101e-01],\n",
            "         [-2.1739e-01,  2.0022e-01,  2.8911e-01,  ..., -9.8774e-01,\n",
            "           4.8584e-02, -6.7717e-01]],\n",
            "\n",
            "        [[ 1.0280e+00,  5.0743e-01,  7.7207e-02,  ..., -1.5022e-01,\n",
            "          -9.3268e-01, -4.8927e-01],\n",
            "         [ 2.9743e-01,  3.9169e-01,  2.9023e-01,  ..., -2.2965e-01,\n",
            "           6.4456e-01, -2.2597e-01],\n",
            "         [-5.5956e-01,  1.0302e+00, -3.2039e-01,  ..., -4.0319e-02,\n",
            "          -7.5191e-01, -3.1234e-01],\n",
            "         ...,\n",
            "         [ 1.3528e-01, -5.1675e-01, -1.1362e-01,  ..., -5.0100e-01,\n",
            "          -7.3770e-01, -4.7298e-01],\n",
            "         [-3.6730e-01, -1.4140e-01, -1.0335e-01,  ..., -8.7805e-02,\n",
            "          -6.1361e-01, -6.4219e-01],\n",
            "         [-1.5064e-01, -1.6420e+00, -1.1019e+00,  ...,  3.6107e-01,\n",
            "           1.2783e-01, -2.6940e-02]],\n",
            "\n",
            "        [[-4.2673e-01, -3.3197e-01, -9.6817e-02,  ..., -2.2308e-01,\n",
            "           4.3729e-01,  3.0703e-01],\n",
            "         [-2.7142e-01,  1.3038e+00,  3.9566e-01,  ..., -5.7230e-01,\n",
            "           4.5619e-01, -7.7784e-01],\n",
            "         [ 7.5809e-01,  5.4488e-02,  2.8663e-01,  ..., -4.5057e-02,\n",
            "          -3.2061e-01,  1.0343e-01],\n",
            "         ...,\n",
            "         [ 1.3699e+00, -7.2023e-01, -4.6268e-01,  ...,  3.8287e-01,\n",
            "           3.9838e-01, -4.0939e-01],\n",
            "         [-3.6179e-01, -1.4703e-01, -1.1499e-03,  ..., -1.1688e+00,\n",
            "           2.2320e-01,  4.0634e-01],\n",
            "         [ 2.1467e-01,  5.5898e-01, -5.3386e-01,  ..., -1.6671e-01,\n",
            "          -5.3884e-01,  7.5140e-01]]], device='cuda:0')\n",
            "3bit Kern: tensor([[[-2.6905e-01, -1.5902e-01,  9.0068e-01,  ..., -6.6669e-01,\n",
            "          -1.0632e-01,  8.3940e-02],\n",
            "         [-1.9759e-01, -1.0823e-01, -1.1897e+00,  ...,  8.0771e-01,\n",
            "          -2.2273e-01,  5.8908e-01],\n",
            "         [-8.3896e-01,  1.3294e+00, -5.5757e-01,  ..., -1.3583e+00,\n",
            "           3.4895e-01,  6.4654e-01],\n",
            "         ...,\n",
            "         [-1.2205e-01, -4.0897e-01, -1.8123e-01,  ..., -5.1525e-01,\n",
            "          -1.1903e+00,  9.8983e-01],\n",
            "         [-9.6854e-01, -2.7724e-02, -5.2705e-02,  ...,  4.3679e-01,\n",
            "           1.4726e-01, -3.9800e-01],\n",
            "         [ 1.6033e-01, -1.2784e+00,  6.7417e-01,  ...,  1.0639e-01,\n",
            "           3.4901e-01,  5.3523e-01]],\n",
            "\n",
            "        [[ 3.6894e-01, -4.5176e-01,  1.2197e-01,  ...,  3.4895e-02,\n",
            "          -4.3593e-01, -5.4872e-01],\n",
            "         [ 7.3333e-02,  8.8676e-01,  4.7564e-01,  ..., -5.2754e-01,\n",
            "          -2.4349e-02, -1.5780e-01],\n",
            "         [ 4.2421e-01, -1.9107e-01, -5.3282e-01,  ...,  6.1283e-01,\n",
            "           2.5989e-01, -1.2081e+00],\n",
            "         ...,\n",
            "         [-1.0019e-01, -5.7981e-01,  2.8965e-02,  ..., -4.4218e-01,\n",
            "           9.1704e-01,  1.6721e-01],\n",
            "         [-4.0398e-01,  1.1743e+00,  3.6516e-01,  ...,  7.9155e-01,\n",
            "           1.9716e-01,  8.2481e-01],\n",
            "         [-2.5616e-02,  2.3317e-01,  1.0999e+00,  ...,  1.4933e-01,\n",
            "           5.7142e-02, -1.5906e-01]],\n",
            "\n",
            "        [[-5.9735e-02,  4.1035e-01, -1.3762e-01,  ..., -3.0711e-01,\n",
            "          -9.4239e-01,  7.3382e-02],\n",
            "         [ 1.8717e-01,  8.9837e-01, -1.1502e+00,  ...,  2.1802e-01,\n",
            "           2.9637e-02, -3.2210e-01],\n",
            "         [-6.9480e-01,  9.0093e-01, -6.3316e-01,  ...,  1.1667e-01,\n",
            "           5.6764e-03,  1.5465e+00],\n",
            "         ...,\n",
            "         [-1.6929e-01, -2.0358e-01, -2.7423e-01,  ...,  1.8102e-01,\n",
            "           7.7124e-01, -3.1629e-01],\n",
            "         [ 1.3252e-01,  7.3445e-01,  1.1723e+00,  ...,  3.0763e-01,\n",
            "           1.3674e-01, -3.4100e-01],\n",
            "         [-2.1739e-01,  2.0022e-01,  2.8911e-01,  ..., -9.8774e-01,\n",
            "           4.8583e-02, -6.7717e-01]],\n",
            "\n",
            "        [[ 1.0280e+00,  5.0743e-01,  7.7207e-02,  ..., -1.5022e-01,\n",
            "          -9.3268e-01, -4.8927e-01],\n",
            "         [ 2.9743e-01,  3.9169e-01,  2.9023e-01,  ..., -2.2965e-01,\n",
            "           6.4456e-01, -2.2597e-01],\n",
            "         [-5.5956e-01,  1.0302e+00, -3.2039e-01,  ..., -4.0319e-02,\n",
            "          -7.5191e-01, -3.1234e-01],\n",
            "         ...,\n",
            "         [ 1.3528e-01, -5.1676e-01, -1.1363e-01,  ..., -5.0099e-01,\n",
            "          -7.3770e-01, -4.7298e-01],\n",
            "         [-3.6730e-01, -1.4140e-01, -1.0335e-01,  ..., -8.7805e-02,\n",
            "          -6.1361e-01, -6.4219e-01],\n",
            "         [-1.5064e-01, -1.6420e+00, -1.1019e+00,  ...,  3.6107e-01,\n",
            "           1.2783e-01, -2.6940e-02]],\n",
            "\n",
            "        [[-4.2673e-01, -3.3197e-01, -9.6817e-02,  ..., -2.2308e-01,\n",
            "           4.3729e-01,  3.0703e-01],\n",
            "         [-2.7142e-01,  1.3038e+00,  3.9566e-01,  ..., -5.7230e-01,\n",
            "           4.5619e-01, -7.7784e-01],\n",
            "         [ 7.5809e-01,  5.4488e-02,  2.8663e-01,  ..., -4.5057e-02,\n",
            "          -3.2061e-01,  1.0343e-01],\n",
            "         ...,\n",
            "         [ 1.3700e+00, -7.2023e-01, -4.6268e-01,  ...,  3.8287e-01,\n",
            "           3.9838e-01, -4.0939e-01],\n",
            "         [-3.6179e-01, -1.4703e-01, -1.1501e-03,  ..., -1.1688e+00,\n",
            "           2.2320e-01,  4.0634e-01],\n",
            "         [ 2.1467e-01,  5.5898e-01, -5.3386e-01,  ..., -1.6671e-01,\n",
            "          -5.3884e-01,  7.5140e-01]]], device='cuda:0')\n",
            "\n",
            "\n",
            "4bit Simu: tensor([[[ 0.3409,  0.1163, -0.0713,  ..., -1.2378, -0.2355, -0.9327],\n",
            "         [-0.3185, -0.3621, -0.7409,  ..., -0.1507, -1.1744,  0.5434],\n",
            "         [ 0.0519,  0.8851,  0.6816,  ...,  0.0752, -0.6438,  1.1702],\n",
            "         ...,\n",
            "         [ 0.2462,  1.5813, -0.5947,  ...,  0.4940, -0.2779, -0.1576],\n",
            "         [ 0.2529,  0.3197,  0.2337,  ...,  0.3987, -1.5286,  0.3646],\n",
            "         [-0.9714,  0.2794, -0.1031,  ..., -1.0960,  0.3168, -0.7899]],\n",
            "\n",
            "        [[-0.5303,  0.1734, -0.0591,  ...,  0.9764,  0.2096,  0.4706],\n",
            "         [ 0.2493,  0.4132,  0.7467,  ...,  0.3129, -0.6910,  0.2435],\n",
            "         [-0.4530,  0.2228, -0.2907,  ..., -0.2217,  0.2605,  0.2304],\n",
            "         ...,\n",
            "         [-0.9120, -0.5751, -0.1884,  ..., -1.0017,  0.3089,  1.2168],\n",
            "         [ 0.2820,  0.2496,  0.5855,  ..., -0.4431, -0.5884,  0.1653],\n",
            "         [ 1.4851, -0.3262,  0.6374,  ..., -0.3950,  0.1002,  0.2286]],\n",
            "\n",
            "        [[-0.6246, -0.3909,  0.6599,  ..., -0.1083,  0.4711,  0.2585],\n",
            "         [-0.8347,  0.8995, -0.0796,  ..., -0.5707, -0.2438, -0.0074],\n",
            "         [-0.5048, -1.6626,  0.7788,  ...,  0.2924, -0.2436, -0.0927],\n",
            "         ...,\n",
            "         [ 0.0874, -0.9244,  0.5524,  ..., -0.3129, -0.4003,  0.4855],\n",
            "         [-0.2516,  0.5097,  0.0160,  ..., -0.9318, -0.7483,  0.2749],\n",
            "         [-0.1661, -0.0682, -0.1576,  ..., -0.4610,  1.2449,  0.4009]],\n",
            "\n",
            "        [[ 0.5346, -0.4638, -0.1476,  ..., -0.1461, -0.2897,  0.0885],\n",
            "         [ 0.3012,  0.3939,  0.1502,  ..., -0.9472,  0.0169,  0.1088],\n",
            "         [ 0.3329, -0.0447,  0.3757,  ..., -0.5941,  0.4111,  0.5518],\n",
            "         ...,\n",
            "         [-0.8538, -0.3703,  0.1632,  ..., -0.5587, -0.2425,  0.3611],\n",
            "         [ 0.1513, -0.4870,  0.0311,  ..., -0.1869, -0.6509,  1.3294],\n",
            "         [-0.4406, -0.5038,  0.4541,  ..., -0.4605, -0.2537, -0.3636]],\n",
            "\n",
            "        [[ 1.0051, -0.4970, -0.5361,  ...,  0.0102, -0.2627, -0.2806],\n",
            "         [ 0.7463,  0.0111, -0.4338,  ..., -0.7396,  0.1446,  0.7116],\n",
            "         [-1.0110,  0.1770, -0.2314,  ..., -0.1498, -0.5213,  0.4980],\n",
            "         ...,\n",
            "         [ 0.4852,  0.5209, -0.1218,  ...,  0.7162, -0.0102,  0.5339],\n",
            "         [-0.5512, -0.2305,  1.2516,  ..., -0.4927, -0.2886,  0.3945],\n",
            "         [ 0.7476, -0.3242,  0.3596,  ...,  0.8673,  0.4849, -0.7631]]],\n",
            "       device='cuda:0')\n",
            "4bit Kern: tensor([[[ 0.3409,  0.1163, -0.0713,  ..., -1.2378, -0.2355, -0.9327],\n",
            "         [-0.3185, -0.3621, -0.7409,  ..., -0.1507, -1.1744,  0.5434],\n",
            "         [ 0.0519,  0.8851,  0.6816,  ...,  0.0752, -0.6438,  1.1702],\n",
            "         ...,\n",
            "         [ 0.2462,  1.5813, -0.5946,  ...,  0.4940, -0.2779, -0.1576],\n",
            "         [ 0.2529,  0.3197,  0.2337,  ...,  0.3987, -1.5286,  0.3646],\n",
            "         [-0.9714,  0.2794, -0.1031,  ..., -1.0960,  0.3168, -0.7899]],\n",
            "\n",
            "        [[-0.5303,  0.1734, -0.0591,  ...,  0.9764,  0.2096,  0.4706],\n",
            "         [ 0.2493,  0.4132,  0.7467,  ...,  0.3129, -0.6910,  0.2435],\n",
            "         [-0.4530,  0.2228, -0.2907,  ..., -0.2217,  0.2605,  0.2304],\n",
            "         ...,\n",
            "         [-0.9120, -0.5751, -0.1884,  ..., -1.0017,  0.3089,  1.2168],\n",
            "         [ 0.2820,  0.2496,  0.5855,  ..., -0.4431, -0.5884,  0.1653],\n",
            "         [ 1.4851, -0.3262,  0.6374,  ..., -0.3950,  0.1002,  0.2286]],\n",
            "\n",
            "        [[-0.6246, -0.3909,  0.6599,  ..., -0.1083,  0.4711,  0.2585],\n",
            "         [-0.8347,  0.8995, -0.0796,  ..., -0.5707, -0.2438, -0.0074],\n",
            "         [-0.5048, -1.6626,  0.7788,  ...,  0.2924, -0.2436, -0.0927],\n",
            "         ...,\n",
            "         [ 0.0874, -0.9244,  0.5524,  ..., -0.3129, -0.4003,  0.4855],\n",
            "         [-0.2516,  0.5097,  0.0160,  ..., -0.9318, -0.7483,  0.2749],\n",
            "         [-0.1661, -0.0682, -0.1576,  ..., -0.4610,  1.2449,  0.4009]],\n",
            "\n",
            "        [[ 0.5346, -0.4638, -0.1476,  ..., -0.1461, -0.2897,  0.0885],\n",
            "         [ 0.3012,  0.3939,  0.1502,  ..., -0.9472,  0.0169,  0.1089],\n",
            "         [ 0.3329, -0.0447,  0.3757,  ..., -0.5941,  0.4111,  0.5518],\n",
            "         ...,\n",
            "         [-0.8538, -0.3703,  0.1632,  ..., -0.5587, -0.2425,  0.3611],\n",
            "         [ 0.1513, -0.4870,  0.0311,  ..., -0.1869, -0.6509,  1.3294],\n",
            "         [-0.4406, -0.5038,  0.4541,  ..., -0.4605, -0.2537, -0.3636]],\n",
            "\n",
            "        [[ 1.0051, -0.4970, -0.5360,  ...,  0.0102, -0.2627, -0.2806],\n",
            "         [ 0.7463,  0.0111, -0.4338,  ..., -0.7396,  0.1446,  0.7116],\n",
            "         [-1.0110,  0.1770, -0.2314,  ..., -0.1498, -0.5213,  0.4980],\n",
            "         ...,\n",
            "         [ 0.4852,  0.5209, -0.1218,  ...,  0.7162, -0.0102,  0.5339],\n",
            "         [-0.5512, -0.2305,  1.2516,  ..., -0.4927, -0.2886,  0.3945],\n",
            "         [ 0.7476, -0.3242,  0.3596,  ...,  0.8673,  0.4849, -0.7631]]],\n",
            "       device='cuda:0')\n",
            "\n",
            "\n",
            "8bit Simu: tensor([[[-1.3612e-01,  2.3868e-01,  4.3228e-01,  ..., -9.5034e-02,\n",
            "           5.1928e-01, -8.7504e-02],\n",
            "         [ 1.5274e+00,  2.6957e-02,  5.1080e-02,  ..., -1.4028e+00,\n",
            "           5.3464e-01,  8.9787e-01],\n",
            "         [ 1.6875e-01, -1.9182e-01,  5.0604e-02,  ..., -1.5846e-01,\n",
            "          -6.8970e-02, -1.1131e+00],\n",
            "         ...,\n",
            "         [-7.7195e-01, -5.7850e-01, -3.6307e-02,  ...,  4.7196e-01,\n",
            "           3.4639e-01,  6.1058e-01],\n",
            "         [ 4.4703e-01,  4.9701e-01, -2.0531e-01,  ...,  2.2105e-01,\n",
            "           4.7335e-01, -2.4894e-01],\n",
            "         [-3.8796e-01, -8.5836e-01, -7.4272e-01,  ..., -2.3791e-01,\n",
            "           2.2564e-01, -1.5741e+00]],\n",
            "\n",
            "        [[ 5.9650e-01,  1.0593e+00,  4.9382e-01,  ...,  2.6581e-01,\n",
            "          -5.4062e-01,  5.6072e-01],\n",
            "         [-3.0853e-01,  4.3161e-02,  1.8714e-01,  ...,  1.9532e-01,\n",
            "           9.3350e-01,  7.8501e-01],\n",
            "         [ 2.4057e-02,  4.4674e-01,  1.7203e-01,  ...,  1.6688e-01,\n",
            "          -8.3594e-01, -5.1792e-01],\n",
            "         ...,\n",
            "         [-7.8548e-01,  6.2727e-02, -4.7613e-01,  ..., -9.6201e-01,\n",
            "          -2.4022e-01, -5.9171e-01],\n",
            "         [-8.7194e-02,  1.2760e-03, -4.3090e-01,  ..., -3.2074e-01,\n",
            "           1.0458e-01,  1.7414e-02],\n",
            "         [ 3.0700e-01,  5.2365e-02,  3.2642e-01,  ..., -1.0576e+00,\n",
            "          -4.6741e-01, -4.0205e-01]],\n",
            "\n",
            "        [[-1.9076e-01,  1.1169e-01, -1.4749e-01,  ...,  4.7497e-01,\n",
            "           6.0297e-01, -4.7164e-01],\n",
            "         [-1.2195e-01,  5.0959e-01, -6.5373e-02,  ...,  4.4890e-01,\n",
            "          -3.8201e-01,  1.3991e-01],\n",
            "         [ 6.2662e-01,  2.3795e-01, -5.5029e-01,  ..., -1.1801e+00,\n",
            "           5.3575e-03,  1.6566e-01],\n",
            "         ...,\n",
            "         [-4.3383e-01,  1.4268e-02,  3.2946e-01,  ...,  1.2408e+00,\n",
            "          -6.6904e-01, -8.4586e-01],\n",
            "         [ 4.4570e-01, -8.2566e-01,  2.8087e-01,  ...,  1.7394e-01,\n",
            "           5.2925e-01,  3.0300e-02],\n",
            "         [-3.1081e-01,  2.7897e-01,  9.4182e-02,  ..., -3.2118e-01,\n",
            "          -5.3960e-01,  2.3494e-01]],\n",
            "\n",
            "        [[ 4.0766e-01, -1.1317e+00, -4.5914e-01,  ..., -1.8660e-01,\n",
            "           6.7156e-02, -3.0988e-01],\n",
            "         [-3.6942e-01, -4.3924e-01,  7.9278e-02,  ...,  1.3519e-01,\n",
            "           1.4212e+00,  2.8826e-01],\n",
            "         [ 3.0997e-01, -3.3498e-01,  7.6541e-01,  ..., -1.4724e-01,\n",
            "           6.8804e-01, -3.2103e-01],\n",
            "         ...,\n",
            "         [-1.7104e-01, -7.6153e-01, -3.9197e-02,  ...,  5.8567e-01,\n",
            "          -8.9170e-01,  5.9681e-02],\n",
            "         [-8.1458e-02,  1.5885e+00, -4.2417e-01,  ...,  1.2265e+00,\n",
            "           5.4523e-01,  3.3200e-01],\n",
            "         [-1.4799e+00,  8.1969e-01, -3.0668e-01,  ..., -2.8946e-01,\n",
            "          -1.0448e-01, -4.6879e-01]],\n",
            "\n",
            "        [[ 2.6653e-01, -1.3003e+00, -2.8491e-01,  ...,  1.2587e-01,\n",
            "           4.9998e-01, -5.9461e-02],\n",
            "         [ 6.7636e-02,  5.9828e-01, -1.7459e-01,  ...,  9.9408e-01,\n",
            "           7.0125e-01, -5.5082e-02],\n",
            "         [-2.4253e-01, -5.9938e-01,  7.0244e-02,  ..., -8.2513e-01,\n",
            "           6.2857e-01,  5.8800e-01],\n",
            "         ...,\n",
            "         [ 1.1082e+00, -3.6440e-02, -5.4006e-02,  ..., -6.3404e-01,\n",
            "           1.1332e-01, -4.1249e-01],\n",
            "         [-1.0050e-02,  4.2679e-02, -5.2675e-01,  ...,  1.8629e-01,\n",
            "           1.1292e-01, -2.0926e-01],\n",
            "         [-6.6499e-01,  1.4787e+00, -3.0871e-01,  ...,  5.3211e-01,\n",
            "          -4.0761e-01, -8.2771e-01]]], device='cuda:0')\n",
            "8bit Kern: tensor([[[-1.3612e-01,  2.3868e-01,  4.3228e-01,  ..., -9.5034e-02,\n",
            "           5.1928e-01, -8.7505e-02],\n",
            "         [ 1.5274e+00,  2.6956e-02,  5.1080e-02,  ..., -1.4028e+00,\n",
            "           5.3464e-01,  8.9787e-01],\n",
            "         [ 1.6875e-01, -1.9182e-01,  5.0603e-02,  ..., -1.5846e-01,\n",
            "          -6.8970e-02, -1.1131e+00],\n",
            "         ...,\n",
            "         [-7.7195e-01, -5.7850e-01, -3.6307e-02,  ...,  4.7196e-01,\n",
            "           3.4639e-01,  6.1058e-01],\n",
            "         [ 4.4703e-01,  4.9701e-01, -2.0531e-01,  ...,  2.2105e-01,\n",
            "           4.7335e-01, -2.4894e-01],\n",
            "         [-3.8796e-01, -8.5836e-01, -7.4271e-01,  ..., -2.3791e-01,\n",
            "           2.2563e-01, -1.5741e+00]],\n",
            "\n",
            "        [[ 5.9650e-01,  1.0593e+00,  4.9382e-01,  ...,  2.6581e-01,\n",
            "          -5.4062e-01,  5.6072e-01],\n",
            "         [-3.0853e-01,  4.3161e-02,  1.8714e-01,  ...,  1.9532e-01,\n",
            "           9.3351e-01,  7.8501e-01],\n",
            "         [ 2.4057e-02,  4.4674e-01,  1.7202e-01,  ...,  1.6688e-01,\n",
            "          -8.3594e-01, -5.1792e-01],\n",
            "         ...,\n",
            "         [-7.8548e-01,  6.2727e-02, -4.7613e-01,  ..., -9.6201e-01,\n",
            "          -2.4022e-01, -5.9171e-01],\n",
            "         [-8.7194e-02,  1.2754e-03, -4.3090e-01,  ..., -3.2074e-01,\n",
            "           1.0458e-01,  1.7413e-02],\n",
            "         [ 3.0700e-01,  5.2365e-02,  3.2642e-01,  ..., -1.0576e+00,\n",
            "          -4.6741e-01, -4.0205e-01]],\n",
            "\n",
            "        [[-1.9076e-01,  1.1169e-01, -1.4749e-01,  ...,  4.7497e-01,\n",
            "           6.0297e-01, -4.7164e-01],\n",
            "         [-1.2195e-01,  5.0959e-01, -6.5373e-02,  ...,  4.4890e-01,\n",
            "          -3.8201e-01,  1.3991e-01],\n",
            "         [ 6.2662e-01,  2.3795e-01, -5.5029e-01,  ..., -1.1801e+00,\n",
            "           5.3578e-03,  1.6566e-01],\n",
            "         ...,\n",
            "         [-4.3383e-01,  1.4267e-02,  3.2946e-01,  ...,  1.2408e+00,\n",
            "          -6.6904e-01, -8.4586e-01],\n",
            "         [ 4.4570e-01, -8.2566e-01,  2.8087e-01,  ...,  1.7394e-01,\n",
            "           5.2925e-01,  3.0300e-02],\n",
            "         [-3.1081e-01,  2.7897e-01,  9.4182e-02,  ..., -3.2118e-01,\n",
            "          -5.3960e-01,  2.3494e-01]],\n",
            "\n",
            "        [[ 4.0766e-01, -1.1317e+00, -4.5914e-01,  ..., -1.8660e-01,\n",
            "           6.7156e-02, -3.0988e-01],\n",
            "         [-3.6942e-01, -4.3924e-01,  7.9277e-02,  ...,  1.3518e-01,\n",
            "           1.4212e+00,  2.8826e-01],\n",
            "         [ 3.0997e-01, -3.3498e-01,  7.6541e-01,  ..., -1.4724e-01,\n",
            "           6.8804e-01, -3.2103e-01],\n",
            "         ...,\n",
            "         [-1.7104e-01, -7.6153e-01, -3.9197e-02,  ...,  5.8567e-01,\n",
            "          -8.9170e-01,  5.9681e-02],\n",
            "         [-8.1458e-02,  1.5885e+00, -4.2417e-01,  ...,  1.2265e+00,\n",
            "           5.4523e-01,  3.3200e-01],\n",
            "         [-1.4799e+00,  8.1969e-01, -3.0668e-01,  ..., -2.8946e-01,\n",
            "          -1.0448e-01, -4.6879e-01]],\n",
            "\n",
            "        [[ 2.6653e-01, -1.3003e+00, -2.8491e-01,  ...,  1.2587e-01,\n",
            "           4.9998e-01, -5.9461e-02],\n",
            "         [ 6.7636e-02,  5.9828e-01, -1.7459e-01,  ...,  9.9408e-01,\n",
            "           7.0125e-01, -5.5082e-02],\n",
            "         [-2.4253e-01, -5.9938e-01,  7.0243e-02,  ..., -8.2513e-01,\n",
            "           6.2857e-01,  5.8800e-01],\n",
            "         ...,\n",
            "         [ 1.1082e+00, -3.6439e-02, -5.4006e-02,  ..., -6.3404e-01,\n",
            "           1.1332e-01, -4.1249e-01],\n",
            "         [-1.0049e-02,  4.2678e-02, -5.2676e-01,  ...,  1.8629e-01,\n",
            "           1.1292e-01, -2.0926e-01],\n",
            "         [-6.6499e-01,  1.4787e+00, -3.0871e-01,  ...,  5.3211e-01,\n",
            "          -4.0761e-01, -8.2771e-01]]], device='cuda:0')\n"
          ]
        }
      ],
      "source": [
        "! python setup_cuda.py install && CUDA_VISIBLE_DEVICES=0 python test_kernel.py\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "b0gYQ0aFSVTH"
      },
      "source": [
        "### Download the BELLE_BLOOM_GPTQ_4BIT model to Colab\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "7r0uotkjFjK8",
        "outputId": "2ab9cbe1-d8a7-4593-abea-de791ced63fb"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Updated git hooks.\n",
            "Git LFS initialized.\n",
            "Cloning into 'BELLE_BLOOM_GPTQ_4BIT'...\n",
            "remote: Enumerating objects: 18, done.\u001b[K\n",
            "remote: Counting objects: 100% (18/18), done.\u001b[K\n",
            "remote: Compressing objects: 100% (17/17), done.\u001b[K\n",
            "remote: Total 18 (delta 2), reused 0 (delta 0), pack-reused 0\u001b[K\n",
            "Unpacking objects: 100% (18/18), 4.08 MiB | 4.69 MiB/s, done.\n",
            "Encountered 1 file(s) that may not have been copied correctly on Windows:\n",
            "\tbloom7b-2m-4bit-128g.pt\n",
            "\n",
            "See: `git lfs help smudge` for more details.\n"
          ]
        }
      ],
      "source": [
        "!git lfs install && git clone https://huggingface.co/BelleGroup/BELLE_BLOOM_GPTQ_4BIT"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ouZZilIOHr5Y",
        "outputId": "9f9b0550-9e1e-4b8d-af3f-c758082f5c97"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "bloom7b-2m-4bit-128g.pt       README.md\t\t       tokenizer.json\n",
            "config.json\t\t      special_tokens_map.json\n",
            "pytorch_model.bin.index.json  tokenizer_config.json\n"
          ]
        }
      ],
      "source": [
        "!ls BELLE_BLOOM_GPTQ_4BIT"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "w27oK-ACsagp",
        "outputId": "69a75c4a-92a1-482a-fb94-700dc52f9691"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "/content/BELLE/gptq\n"
          ]
        }
      ],
      "source": [
        "!pwd"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "OnPC00NyyNgH"
      },
      "source": [
        "## Run the cell below, then click after `Human:` and type your message to BELLE."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "wno4FIAgZ8CI",
        "outputId": "9ab1b847-5242-4915-d498-54f3fb13295e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Loading model ...\n",
            "Done.\n",
            "Human:\n",
            "你是谁？\n",
            "Assistant:\n",
            "\n",
            "2023-04-13 10:26:28.343829: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n",
            "也是一个问题，但是我的主人没有给我答案。</s>\n",
            "\n",
            "-------------------------------\n",
            "\n",
            "Human:\n",
            "怎么让自己变得精力充沛？\n",
            "Assistant:\n",
            "\n",
            "\n",
            "\n",
            "1. 睡眠时间充足。良好的睡眠可以让人的精力充沛，应该每天保证7-8小时的睡眠。\n",
            "2. 坚持运动。适量的运动可以促进身体的新陈代谢，让身体更加精力充沛。\n",
            "3. 均衡饮食。均衡的饮食可以为身体提供足够的能量和支持，避免感到疲乏。\n",
            "4. 避免压力。压力可以使人的精神状态不佳，应该采取一些方法来缓解压力。\n",
            "5. 建立良好习惯。建立一些良好的习惯可以让我们更加精力充沛，如每天坚持一些自己的爱好，规律的生活等等。</s>\n",
            "\n",
            "-------------------------------\n",
            "\n",
            "Human:\n",
            "写一首歌颂程序员的诗。\n",
            "Assistant:\n",
            "\n",
            "\n",
            "\n",
            "代码海洋漫无边，\n",
            "代码海洋深不可测。\n",
            "逻辑思路如天网，\n",
            "程序员们日以继夜。\n",
            "\n",
            "他们喜欢把问题简化，\n",
            "却让复杂变简单。\n",
            "他们的想象力无限，\n",
            "解决难题难题。\n",
            "\n",
            "程序员们是英雄，\n",
            "是人类智慧的源泉。\n",
            "让我们珍惜他们的努力，\n",
            "珍惜他们带来的成果。</s>\n",
            "\n",
            "-------------------------------\n",
            "\n",
            "Human:\n",
            "\n",
            "^C\n"
          ]
        }
      ],
      "source": [
        "! python bloom_inference.py BELLE_BLOOM_GPTQ_4BIT --temperature 1.2 --wbits 4 --groupsize 128 --load BELLE_BLOOM_GPTQ_4BIT/bloom7b-2m-4bit-128g.pt"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "provenance": [],
      "toc_visible": true
    },
    "gpuClass": "standard",
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
